A new class of exact-repair regenerating codes is constructed by combining two layers of erasure
correction codes together with combinatorial block designs, e.g., Steiner systems, balanced incomplete
block designs and t-designs. The proposed codes have the "uncoded repair" property where the nodes
participating in the repair simply transfer part of the stored data directly, without performing any
computation. The layered error correction structure makes the decoding process rather straightforward,
and in general the complexity is low. We show that this construction is able to achieve performance better
than time-sharing between the minimum storage regenerating codes and the minimum repair-bandwidth
regenerating codes.



I. Introduction 
Distributed data storage systems can encode and disperse information (a message) to multiple storage
nodes (or disks) in the network such that a user can retrieve it by accessing only a subset of them. Such
systems are able to provide superior reliability in the event of disk corruption or network congestion.
In order to reduce the amount of storage redundancy required to guarantee such
reliability performance, erasure correction codes can be used instead of simple replication of the data.

When the data is coded by an erasure code, data repair (e.g., due to node failure) becomes more 
involved, because the information stored at a given node may not be directly available from any one of the
remaining storage nodes, but it can be nevertheless reconstructed since it is a function of the information 
stored at these nodes. One key issue that affects the system performance is the total amount of information 
that the remaining nodes need to transmit to the new node. Consider a storage system which has a total 
of n storage nodes, and the data can be reconstructed by accessing any k of them. A failed node is 
repaired by requesting any d of the remaining nodes to provide information, and then using the received 
information to construct a new data storage node. A naive approach is to let these helper nodes transmit 
sufficient data such that the underlying complete data can be reconstructed, and then the information
that needs to be stored at the new node can be subsequently generated. This approach is however rather 
wasteful, since the data stored at the new node is only a fraction of the complete data. 

Dimakis et al. in [1] provided a theoretical framework, namely regenerating codes, to investigate the
tradeoff between the amount of storage at each node (i.e., data storage) and the amount of data transfer
for repair (i.e., repair bandwidth). It was shown that for the case when the regenerated information at
the new node only needs to fulfill the role of the failed node functionally (i.e., functional-repair), but not
to replicate exactly the original information content at the failed node (i.e., exact-repair), the problem
can be converted to an equivalent network multicast problem, and thus the celebrated network coding
result [2] can be applied. By way of this equivalence, the optimal tradeoff between the storage and
repair bandwidth was completely characterized in [1] for functional-repair regenerating codes. The two
important extreme cases of the optimal tradeoff, where the data storage is minimized and the repair
bandwidth is minimized, are referred to as minimum storage regenerating (MSR) codes and minimum
bandwidth regenerating (MBR) codes, respectively. The problem of functional-repair regenerating codes
is well understood and constructions of such codes are available ID, Q, El- 

The functional-repair framework implies that the repair rule and the decoding rule in the system
may evolve over time, which incurs additional system overhead. Furthermore, functional repair does not 
guarantee the data to be stored in systematic form, which is an important practical requirement to consider. 
In contrast, exact-repair regenerating codes do not have such disadvantages. The problem of exact-repair 
regenerating codes was investigated in — ifTTI . all of which address either the MBR case or the MSR 
case. Particularly, the optimal code constructions in and Q show that the more stringent exact-repair 
requirement does not incur any penalty for the MBR case; the constructions in [6]-[8] show that this is
also true for the MSR case. These results may lead to the impression that enforcing exact-repair never 
incurs any penalty compared to functional repair. However, the result in O shows that this is not the 
case, and in fact a large portion of the optimal tradeoffs achievable by functional-repair codes cannot
be strictly achieved by exact-repair codes¹.

Codes achieving tradeoff points other than the MBR or the MSR points may be more suitable for a system
employing exact-repair regenerating codes. From a practical point of view, codes achieving other tradeoff
points may have lower complexity than using the time-sharing approach, because the MSR point requires
interference alignment, and it is known to be impossible for linear codes to achieve the MSR point
for some parameters without symbol extension [6]. As such, it is important to find such codes with
competitive performance and low complexity. However, it is in fact unknown whether there even exist
codes that can achieve a storage-bandwidth tradeoff better than simply time-sharing between an MBR
code and an MSR code. In this work, we provide a linear code construction based on the combination of
two layers of erasure correction codes and combinatorial block designs, which is indeed able to achieve
tradeoff points better than the time-sharing between an MBR code and an MSR code. The two erasure
correction codes are not independent; they must be jointly designed to satisfy certain full rank conditions
to guarantee successful decoding. In this work we mainly focus on the case when $d = n - 1$, i.e., when
the repair requires access to all the other storage nodes, although the construction can indeed be
generalized to the case $d < n - 1$.

¹ One may question whether exact-repair codes can asymptotically approach these tradeoffs; however, in [16] it is shown that
there indeed exists a non-vanishing gap between the optimal functional-repair tradeoff and the exact-repair tradeoff.

The conceptually straightforward code construction we propose has the property that the nodes
participating in the repair do not need to perform any computation, but can simply transmit certain stored
information for the new node to synthesize and recover the lost information. The uncoded repair property
is appealing in practice, since it reduces and almost completely eliminates the computation burden at
the helper nodes. This property also holds in the constructions proposed in [5] and [12]. In fact our
construction was partially inspired by and may be viewed as a generalization of these codes. Another
closely related work is [18], where repetition and erasure correction codes are combined to construct
codes for the MBR point, and one of the constructions indeed relies on Steiner systems. The model in [18]
is however different from ours (and that in flT|), where the repair procedure only needs to guarantee the 
existence of one particular d-helper-node combination (fix-access repair), instead of the more stringent
requirement that the repair information can come from any d-helper-node combination (random-access
repair). Though both models have their merits, we focus on the more stringent and thus more robust
random-access repair model in this work.

The rest of the paper is organized as follows. In Section II a formal definition is given for the coding
problem and several relevant existing results are reviewed. Section III provides an example to illustrate
the structure of the proposed construction. Section IV provides the general code construction in three
progressive steps, and in Section V the performance is analyzed. Finally, Section VI concludes the paper.

II. Problem Definition and Preliminaries 

In this section, we first provide a formal definition of exact-repair regenerating codes. Some existing
results on regenerating codes, and the basics of maximum distance separable codes and block designs, are
also briefly reviewed.
A. Definition of Exact-Repair Regenerating Codes

An $(n, k, d)$ exact-repair regenerating code is a storage system with a total of $n$ storage nodes (disks)²,
where any $k$ of them can be used to reconstruct the complete data, and furthermore, to repair a lost disk,
the new disk may access data from any $d$ of the remaining $n - 1$ disks. Let the total amount of raw data
stored be $M$ units and let each storage site store $\alpha$ units of data, which implies that the redundancy of
the system is $n\alpha - M$. To repair a disk failure (regenerate a new disk), each contributing disk transmits
$\beta$ units of data to the new node, which results in a total of $d\beta$ units of data transfer for repair. It is clear
that the quantities $\alpha$ and $\beta$ scale linearly with $M$, because a code can simply be concatenated. For this
reason we shall normalize the other two quantities using $\beta$,

$$\bar{\alpha} \triangleq \frac{\alpha}{\beta}, \qquad \bar{M} \triangleq \frac{M}{\beta}, \tag{1}$$

and use them as the measure of performance from here on.

Formally, the problem can be defined as follows. The notation $I_n$ is used to denote the set $\{1, 2, \ldots, n\}$,
and without loss of generality we assume $k \le d$.

Definition 1: An $(n, k, d, N, N_d, K)$ exact-repair regenerating code consists of a total of $n$ encoding
functions $f_i^E(\cdot)$, a total of $\binom{n}{k}$ decoding functions $f_A^D(\cdot)$, a total of $nd\binom{n-1}{d}$ repair encoding functions
$F_{i,A,j}^E(\cdot)$, and a total of $n\binom{n-1}{d}$ repair decoding functions $F_{j,A}^D(\cdot)$, where

$$f_i^E : I_N \to I_{N_d}, \quad i \in I_n, \tag{2}$$

which map the message $m \in I_N$ to $n$ pieces of coded information,

$$f_A^D : I_{N_d}^k \to I_N, \quad A \subseteq I_n \text{ and } |A| = k, \tag{3}$$

which maps the $k$ pieces of coded information in a set $A$ to the original message,

$$F_{i,A,j}^E : I_{N_d} \to I_K, \quad j \in I_n,\ A \subseteq I_n \setminus \{j\} \text{ and } |A| = d,\ i \in A, \tag{4}$$

which maps a piece of coded information to an index that will be made available to the new node, and

$$F_{j,A}^D : I_K^d \to I_{N_d}, \quad j \in I_n,\ A \subseteq I_n \setminus \{j\} \text{ and } |A| = d, \tag{5}$$

which maps $d$ of such indices from the helper nodes to reconstruct the information stored at the lost
node. The functions must satisfy the data reconstruction conditions

$$f_A^D\Big(\prod_{i \in A} f_i^E(m)\Big) = m, \quad m \in I_N,\ A \subseteq I_n \text{ and } |A| = k, \tag{6}$$

and the repair conditions

$$F_{j,A}^D\Big(\prod_{i \in A} F_{i,A,j}^E\big(f_i^E(m)\big)\Big) = f_j^E(m), \quad m \in I_N,\ j \in I_n,\ A \subseteq I_n \setminus \{j\} \text{ and } |A| = d. \tag{7}$$

² From here on, we shall use "node" and "disk" interchangeably.

February 20, 2013 DRAFT



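Definition 2: A normalized pair $(\bar{\alpha}, \bar{M})$ is said to be achievable for $(n, k, d)$ regenerating if for any
$\epsilon > 0$ there exists an $(n, k, d, N, N_d, K)$ code such that

$$\bar{\alpha} + \epsilon \ge \frac{\log N_d}{\log K} \tag{8}$$

and

$$\bar{M} - \epsilon \le \frac{\log N}{\log K}. \tag{9}$$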

The quantity e in the definition above is introduced to include the case when the storage-bandwidth 
tradeoff may be approached asymptotically, e.g., the case discussed in (H. 

It is sometimes insightful to consider the case when $n$ is large while $k = n - \tau_1$ and $d = n - \tau_2$,
where $\tau_1$ and $\tau_2$ are fixed positive constant integers such that $\tau_1 \ge \tau_2$. For this purpose, the following
two quantities become relevant.

Definition 3: An E'-pair (E^, E^) where 

F (n) a log(M - no) (n) A log A? 

log n log n 

is (n, ri, r 2 ) -achievable if (a^ n \ M^) is achievable for (n, n — r l5 n — r 2 ) regenerating. The collection of 
all (n, n, r 2 ) -achievable pairs is denoted as f( n ). The achievable redundancy-data-rate exponent region 
£ is the closure of limsup^^^ £ ( n K 

In Section [V] we shall show that the proposed codes are able to achieve the entire exponent region £, 
while time-sharing between the MSR point and the MBR point can not. 

B. Cut-Set Outer Bound, MBR Point and MSR Point 

As mentioned earlier, the functional-repair regenerating coding problem can be converted to a multicast
problem, and through this connection, a precise characterization of the optimal storage-bandwidth tradeoff
was obtained in [1] using cut-set analysis. Since exact-repair is a more stringent requirement than
functional-repair, this characterization provides an outer bound for exact-repair regenerating codes.

Theorem 1 ([1]): Any exact-repair regenerating code must satisfy the following condition

$$\sum_{i=0}^{k-1} \min\big(\bar{\alpha}, (d - i)\big) \ge \bar{M}. \tag{11}$$

One extreme case of this outer bound is when the storage is minimized, i.e., the minimum storage
regenerating (MSR) point, which is

$$\bar{\alpha} = (d - k + 1), \qquad \bar{M} = k(d - k + 1). \tag{12}$$

The other extreme case is when the repair bandwidth is minimized, i.e., the minimum bandwidth
regenerating (MBR) point, which is

$$\bar{\alpha} = d, \qquad \bar{M} = \frac{k(2d - k + 1)}{2}. \tag{13}$$

Both of these two extreme points are achievable [5]-[8] also for the exact-repair case. The functional
repair outer bound is however not tight in general, which implies that the exact-repair condition will 
indeed incur a penalty in many cases COD. The cut-set outer bound and the two extreme points 
are illustrated in Fig. 1 for $(n, k, d) = (9, 7, 8)$; note that the bound is piecewise linear. The segment
between the MSR point and the origin (0, 0) is given by the trivial bound $k\bar{\alpha} \ge \bar{M}$, and it is essentially
a degenerate regime because to achieve this segment of tradeoff, we can simply utilize an MSR code but
let the helper nodes send more than the necessary amount (i.e., more than $\beta$ units) of data.
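The bound (11) and the two extreme points (12) and (13) are easy to evaluate numerically. The following is a minimal sketch; the function names `cut_set_bound`, `msr_point` and `mbr_point` are illustrative choices and do not appear in the paper.

```python
def cut_set_bound(k, d, alpha_bar):
    """Largest normalized data size allowed by (11) at normalized storage alpha_bar."""
    return sum(min(alpha_bar, d - i) for i in range(k))


def msr_point(k, d):
    """Normalized (alpha, M) at the MSR point, equation (12)."""
    return (d - k + 1, k * (d - k + 1))


def mbr_point(k, d):
    """Normalized (alpha, M) at the MBR point, equation (13).

    k * (2d - k + 1) is always even, so integer division is exact.
    """
    return (d, k * (2 * d - k + 1) // 2)


if __name__ == "__main__":
    k, d = 7, 8  # the (n, k, d) = (9, 7, 8) case used for Fig. 1
    print("MSR:", msr_point(k, d))
    print("MBR:", mbr_point(k, d))
    for alpha_bar in range(2, 9):
        print(alpha_bar, cut_set_bound(k, d, alpha_bar))
```

At the two extreme storage values the bound coincides with the MSR and MBR data sizes, which is a quick consistency check on (11)-(13).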

C. Maximum Distance Separable Code 

A linear code of length $n$ and dimension $k$ is called an $(n, k)$ code. The Singleton bound (see e.g.,
[15]) is a well known upper bound on the minimum distance for any $(n, k)$ code.

Theorem 2: The minimum distance $d_{\min}$ for an $(n, k)$ code is bounded by $d_{\min} \le n - k + 1$.

An $(n, k)$ code that satisfies the Singleton bound with equality is called a maximum distance separable
(MDS) code. A key property of an MDS code is that it can correct any $(n - k)$ or fewer erasures. There
are many ways to find MDS codes for any given $(n, k)$ values, $n > k$. For example, any randomly
generated $n \times k$ matrix in a sufficiently large alphabet is a generator matrix for an MDS code with high
probability. Any $n \times k$ Vandermonde matrix can also generate an MDS code when the entries in the
second column are all distinct. Another explicit construction approach is by puncturing a Reed-Solomon
code of an appropriate alphabet (see e.g., |fT31l ). 
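The Vandermonde statement can be checked by brute force for small parameters. The sketch below assumes a prime field (arithmetic modulo a prime `p`), which is a simplification made here for illustration; the helper names `rank_mod_p`, `vandermonde` and `is_mds` are hypothetical. A generator matrix is MDS exactly when every choice of $k$ rows is invertible.

```python
from itertools import combinations


def rank_mod_p(rows, p):
    """Rank of a matrix over the prime field GF(p), by Gaussian elimination."""
    rows = [[x % p for x in r] for r in rows]
    rank = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)  # modular inverse (Python 3.8+)
        rows[rank] = [(x * inv) % p for x in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                f = rows[i][col]
                rows[i] = [(x - f * y) % p for x, y in zip(rows[i], rows[rank])]
        rank += 1
    return rank


def vandermonde(points, k, p):
    """n x k Vandermonde matrix whose second column holds the evaluation points."""
    return [[pow(a, j, p) for j in range(k)] for a in points]


def is_mds(G, k, p):
    """True if every k rows of the n x k generator matrix G are linearly independent."""
    return all(rank_mod_p([G[i] for i in idx], p) == k
               for idx in combinations(range(len(G)), k))


if __name__ == "__main__":
    p = 7
    print(is_mds(vandermonde([1, 2, 3, 4, 5, 6], 3, p), 3, p))  # distinct points
    print(is_mds(vandermonde([1, 1, 2, 3], 3, p), 3, p))        # repeated point
```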

D. Block Designs 

Block designs have been studied in combinatorial mathematics, with applications in experimental
design, finite geometry, software testing, cryptography, and algebraic geometry. Generally speaking, a
block design is a set together with a family of subsets (i.e., blocks) whose members are chosen to satisfy
some properties that are deemed useful for a particular application. Usually the blocks are required to
all have the same number of elements, and in this case a given block design with parameter $(n, k)$ is
specified by $(X, \mathcal{B})$ where $X$ is an $n$-element set and $\mathcal{B}$ is a collection of $k$-element subsets of $X$.

One important class of block designs is the t-designs. The class of t-designs with parameter $(\lambda, t, r, n)$
is denoted as $S_\lambda(t, r, n)$; a valid t-design in $S_\lambda(t, r, n)$ is a pair $(X, \mathcal{B})$ where $X$ is an $n$-element set and $\mathcal{B}$
is a collection of $r$-element subsets of $X$ with the property that every element in $X$ appears in exactly
$\gamma$ blocks and every $t$-element subset of $X$ is contained in exactly $\lambda$ blocks. Without loss of generality,
one can always use $X = I_n$, and we shall use this convention from here on.

The most extensively researched class of block designs is perhaps Steiner systems, which is the case
when $\lambda = 1$ and $t \ge 2$. In this case, the subscript $\lambda$ is usually omitted and we directly write it as $S(t, r, n)$.
The simplest design in this class is when $t = 2$ and $r = 3$, which is the particularly well understood
Steiner triple systems $S(2, 3, n)$. It is known that there exists a Steiner triple system $S(2, 3, n)$ if and
only if $n = 0$, or $n$ modulo 6 is 1 or 3; see, e.g., [13]. It follows that the smallest positive integer which
gives us a non-trivial Steiner system is $n = 7$ and the next is $n = 9$. Examples of $S(2, 3, 7)$, $S(2, 3, 9)$
are given in Table I, where a design for $S(2, 4, 13)$ is also included. Another well-known special class
of t-designs is Balanced Incomplete Block Designs (BIBDs), which is a special case of t-designs for the
case $t = 2$. It is clear that Steiner systems $S(2, 3, n)$ are also BIBDs.

TABLE I
Example Steiner systems S(2, 3, 7), S(2, 3, 9) and S(2, 4, 13).

$(I_7, \mathcal{B}) \in S(2, 3, 7)$:
    {(1, 2, 3), (1, 4, 5), (1, 6, 7), (2, 4, 6), (2, 5, 7), (3, 4, 7), (3, 5, 6)}

$(I_9, \mathcal{B}) \in S(2, 3, 9)$:
    {(2, 3, 4), (5, 6, 7), (1, 8, 9), (1, 4, 7), (1, 3, 5), (4, 6, 8),
     (2, 7, 9), (2, 5, 8), (1, 2, 6), (4, 5, 9), (3, 7, 8), (3, 6, 9)}

$(I_{13}, \mathcal{B}) \in S(2, 4, 13)$:
    {(1, 2, 4, 10), (2, 3, 5, 11), (3, 4, 6, 12), (4, 5, 7, 13), (5, 6, 8, 1),
     (6, 7, 9, 2), (7, 8, 10, 3), (8, 9, 11, 4), (9, 10, 12, 5),
     (10, 11, 13, 6), (11, 12, 1, 7), (12, 13, 2, 8), (13, 1, 3, 9)}
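The defining property of a t-design can be verified mechanically for the designs in Table I. The following sketch counts, for every $t$-element subset of $I_n$, the number of blocks containing it; the function name `is_t_design` is an illustrative choice, and the $S(2, 4, 13)$ blocks are regenerated as cyclic shifts of $(1, 2, 4, 10)$, which reproduces the list in the table.

```python
from collections import Counter
from itertools import combinations

S_2_3_7 = [(1, 2, 3), (1, 4, 5), (1, 6, 7), (2, 4, 6), (2, 5, 7), (3, 4, 7), (3, 5, 6)]

S_2_3_9 = [(2, 3, 4), (5, 6, 7), (1, 8, 9), (1, 4, 7), (1, 3, 5), (4, 6, 8),
           (2, 7, 9), (2, 5, 8), (1, 2, 6), (4, 5, 9), (3, 7, 8), (3, 6, 9)]

# Cyclic shifts of the base block (1, 2, 4, 10) modulo 13, with elements in 1..13.
S_2_4_13 = [tuple((x - 1 + shift) % 13 + 1 for x in (1, 2, 4, 10)) for shift in range(13)]


def is_t_design(blocks, n, t, r, lam=1):
    """Check that all blocks have size r and every t-subset of {1..n} lies in exactly lam blocks."""
    if any(len(set(b)) != r or not set(b) <= set(range(1, n + 1)) for b in blocks):
        return False
    counts = Counter()
    for b in blocks:
        for sub in combinations(sorted(b), t):
            counts[sub] += 1
    return all(counts[sub] == lam for sub in combinations(range(1, n + 1), t))


if __name__ == "__main__":
    print(is_t_design(S_2_3_7, 7, 2, 3))
    print(is_t_design(S_2_3_9, 9, 2, 3))
    print(is_t_design(S_2_4_13, 13, 2, 4))
```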

For a given $(\lambda, t, r)$ triple, a t-design may not exist for an arbitrary $n$; however, for any $(t, r, n)$, a
trivial t-design always exists with $\lambda^*(t, r, n) = \binom{n-t}{r-t}$, as given in the following proposition.

Proposition 1: For any $(t, r, n)$ where $t \le r \le n$, a complete block design is a block design where
the blocks are all the $r$-element subsets of $I_n$ (and the blocks are not repeated). In this design every
element in $I_n$ appears in exactly $\binom{n-1}{r-1}$ blocks and every $t$-element subset of $I_n$ is contained in exactly
$\lambda^*(t, r, n)$ blocks.

We may still refer to such a complete block design for the case of $t = 2$ as a BIBD, although it is in
fact a complete block design instead of an incomplete one. The following proposition [13] is useful.

Proposition 2: If $(I_n, \mathcal{B})$ is an $S_\lambda(t, r, n)$ design and $S$ is any $s$-element subset of $I_n$, with $0 \le s \le t$,
then the number of blocks containing $S$ is

$$\big|\{B \in \mathcal{B} : S \subseteq B\}\big| = \lambda \binom{n-s}{t-s} \binom{r-s}{t-s}^{-1}. \tag{14}$$

The following corollary follows by setting $s = 0$ in Proposition 2.

Corollary 1: If $(I_n, \mathcal{B})$ is an $S_\lambda(t, r, n)$ design, then the total number of blocks in $\mathcal{B}$ is

$$N_\lambda(t, r, n) \triangleq |\mathcal{B}| = \lambda \binom{n}{t} \binom{r}{t}^{-1}. \tag{15}$$

For the case of Steiner systems, we shall omit the subscript, and simply write it as $N(t, r, n)$. When
the parameters are clear from the context, we may also write $N_\lambda(t, r, n)$ as $N^*$.

There are various known constructions, existence results, and non-existence results for Steiner systems,
BIBDs and t-designs in the literature; interested readers are referred to [13] and [14] for more details.
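As a small numerical illustration of (14) and (15), the counts for the designs in Table I can be computed directly. This is only a sketch; the function name `blocks_containing` is an assumption made here, and a non-integer count is reported as an error because it means no design with those parameters can exist.

```python
from math import comb


def blocks_containing(lam, t, r, n, s):
    """Number of blocks containing a fixed s-element subset, equation (14)."""
    num = lam * comb(n - s, t - s)
    den = comb(r - s, t - s)
    if num % den != 0:
        raise ValueError("non-integer block count: no such design exists")
    return num // den


if __name__ == "__main__":
    for (t, r, n) in [(2, 3, 7), (2, 3, 9), (2, 4, 13)]:
        total = blocks_containing(1, t, r, n, 0)      # equation (15), s = 0
        per_point = blocks_containing(1, t, r, n, 1)  # blocks through one element
        print((t, r, n), total, per_point)
```

Setting $s = 1$ gives the number of blocks through a single element, which is the per-disk storage in the constructions that follow.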
TABLE II
Code constructed using the Steiner triple system S(2, 3, 9) in Table I.

Disk #   1     2     3     4     5     6     7     8     9
         P3    X1    Y1    P1    X2    Y2    P2    X3    Y3
         X4    X7    Y5    Y4    P5    Y6    P4    P6    Y7
         X5    X8    X11   X6    Y8    P9    P7    P8    P10
         X9    Y9    X12   X10   Y10   Y12   Y11   P11   P12

III. An Example (9, 7, 8) Code 

To illustrate the basic code components, we shall construct a (9, 7, 8) exact-repair regenerating code
with $M = 23$, $\alpha = 4$ and $\beta = 1$. The addition and multiplication operations in the encoding and decoding
are in the finite field $\mathbb{F}(3)$; however, this choice is only for concreteness. The construction is based
on the block design $S(2, 3, 9)$ given in Table I.

Let the information be given as a length-23 vector, where the $i$-th entry is denoted as $d_i \in \mathbb{F}(q)$. The
components of this code are described below.

Encoding:

1) Generate a parity symbol $d_{24} = \sum_{j=1}^{12} d_{2j-1} + \sum_{j=1}^{11} 2 d_{2j}$;

2) Pair up $(d_{2j-1}, d_{2j})$, and rename it as $(X_j, Y_j)$ where $j = 1, 2, \ldots, 12$; i.e., $(X_j, Y_j) = (d_{2j-1}, d_{2j})$.

3) Generate a new parity symbol $P_j = X_j + Y_j$, and $(X_j, Y_j, P_j)$ will be referred to as a parity group,
where $j = 1, 2, \ldots, 12$;

4) For each block $B_j = \{b_{j,1}, b_{j,2}, b_{j,3}\}$, $j = 1, 2, \ldots, 12$, in the block design, write one symbol in
the $j$-th parity group in the $b_{j,1}$-th disk, one in the $b_{j,2}$-th disk, and one in the $b_{j,3}$-th disk, respectively.

One possible resulting code symbol placement is illustrated in Table II. The placement is not unique,
since within each parity group, the symbols can be permuted arbitrarily. Note that the second step above
is for facilitating better understanding, and a more concise set of notations will be used in the general
construction in the next section.
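The four encoding steps can be sketched in a few lines. This is an illustrative sketch, not the paper's implementation: the name `encode` is hypothetical, and the symbols of each parity group are placed in the order X, Y, P along the block as listed in Table I. That placement is one of the permitted permutations and differs from Table II in parity groups 3 and 7.

```python
Q = 3  # field size used in the example; arithmetic is modulo 3

# Steiner triple system S(2, 3, 9) from Table I.
BLOCKS = [(2, 3, 4), (5, 6, 7), (1, 8, 9), (1, 4, 7), (1, 3, 5), (4, 6, 8),
          (2, 7, 9), (2, 5, 8), (1, 2, 6), (4, 5, 9), (3, 7, 8), (3, 6, 9)]


def encode(d):
    """Map 23 data symbols to a dict {disk: {symbol name: value}} over F(3)."""
    if len(d) != 23:
        raise ValueError("expected 23 data symbols")
    d = [x % Q for x in d]
    # Step 1: long-code parity d_24 = sum of odd-indexed symbols + 2 * sum of even-indexed ones.
    d24 = (sum(d[0::2]) + 2 * sum(d[1::2])) % Q
    d.append(d24)
    disks = {m: {} for m in range(1, 10)}
    for j, block in enumerate(BLOCKS, start=1):
        # Steps 2 and 3: (X_j, Y_j) = (d_{2j-1}, d_{2j}) and P_j = X_j + Y_j.
        x, y = d[2 * j - 2], d[2 * j - 1]
        group = {"X": x, "Y": y, "P": (x + y) % Q}
        # Step 4: one symbol of the j-th parity group on each disk of block B_j.
        for disk, name in zip(block, "XYP"):
            disks[disk][f"{name}{j}"] = group[name]
    return disks


if __name__ == "__main__":
    for disk, symbols in encode(list(range(23))).items():
        print(disk, symbols)
```

Each disk ends up with four symbols, matching $\alpha = 4$.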

Repair: 

Let us suppose the first disk fails. To regenerate, for example, symbol $X_5$, first obtain $Y_5$ and $P_5$ from
disk-3 and disk-5, respectively, and then compute $X_5 = P_5 - Y_5$. Clearly, other symbols on the disk can
also be repaired following a similar procedure. This procedure also applies to other disk failures. It can
also be checked that for any disk failure, each remaining disk sends a single symbol during the repair,
which is in fact guaranteed by the basic property of block designs in this case.
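The repair procedure can be sketched as follows, under the same illustrative placement assumption (position 0 of each block holds $X_j$, position 1 holds $Y_j$, position 2 holds $P_j$); the names `place` and `repair` are hypothetical. The helpers only forward stored symbols, and all arithmetic happens at the new node.

```python
import random

Q = 3
BLOCKS = [(2, 3, 4), (5, 6, 7), (1, 8, 9), (1, 4, 7), (1, 3, 5), (4, 6, 8),
          (2, 7, 9), (2, 5, 8), (1, 2, 6), (4, 5, 9), (3, 7, 8), (3, 6, 9)]


def place(groups):
    """groups[j] = (X, Y, P); the symbol at position pos goes to disk BLOCKS[j][pos]."""
    disks = {m: {} for m in range(1, 10)}
    for j, block in enumerate(BLOCKS):
        for pos, disk in enumerate(block):
            disks[disk][(j, pos)] = groups[j][pos]
    return disks


def repair(failed, disks):
    """Rebuild the content of disk `failed`; also count symbols sent by each helper."""
    rebuilt = {}
    sent = {m: 0 for m in disks if m != failed}
    for j, block in enumerate(BLOCKS):
        if failed not in block:
            continue
        lost_pos = block.index(failed)
        got = {}
        for pos, helper in enumerate(block):
            if helper != failed:
                got[pos] = disks[helper][(j, pos)]  # uncoded transfer
                sent[helper] += 1
        if lost_pos == 2:   # lost the parity: P = X + Y
            value = (got[0] + got[1]) % Q
        else:               # lost X or Y: subtract the other data symbol from P
            value = (got[2] - got[1 - lost_pos]) % Q
        rebuilt[(j, lost_pos)] = value
    return rebuilt, sent


if __name__ == "__main__":
    random.seed(0)
    groups = []
    for _ in BLOCKS:
        x, y = random.randrange(Q), random.randrange(Q)
        groups.append((x, y, (x + y) % Q))
    disks = place(groups)
    rebuilt, sent = repair(1, disks)
    print(rebuilt == disks[1], sent)
```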
Reconstruction:

For data reconstruction, several different cases need to be considered. Before going into the details
of these cases, consider a scenario where disk 1 and disk 2 are not accessible. Notice that although
$(X_1, X_4, X_5, X_7, X_8, X_9, Y_9, P_3)$ are not accessible directly, $(X_1, X_4, X_5, X_7, X_8, P_3)$ can be
recovered using the symbols on other disks, as discussed in the repair procedure above; thus parity groups
1, 2, 3, 4, 5, 6, 7, 8 are not affected. As a consequence, only symbols in the 9-th parity group cannot be
completely recovered, but even in this parity group, $P_9$ is still accessible on disk-6. The reconstruction
cases can be classified according to which parity group is affected (i.e., cannot be completely recovered
directly) and which symbol within this parity group is still accessible.

1) The $j$-th parity group, $j \in I_{11}$, is affected, but $X_j$ or $Y_j$ is still accessible within it. An example
case is when disk-3 and disk-4 are not accessible. Note $X_1 = d_1$ is still available in this case,
and $d_1 + 2d_2$ can be computed using $d_{24} = Y_{12} = \sum_{j=1}^{12} d_{2j-1} + \sum_{j=1}^{11} 2 d_{2j}$ after eliminating
$(d_{2j-1}, d_{2j})$ pairs for $j = 2, 3, \ldots, 11$ and $d_{23}$, from which $(d_1, d_2)$ can be solved. The information
vector can be obtained by rearranging the symbols.

2) The $j$-th parity group, $j \in I_{11}$, is affected, but the parity symbol $P_j$ is still accessible within it.
An example case is when disk-2 and disk-3 are not accessible. In this case $(d_{2j-1}, d_{2j})$ pairs for
$j = 2, 3, \ldots, 12$ can be recovered. In addition, $P_1 = d_1 + d_2$ and $d_1 + 2d_2$ are available, from
which $(d_1, d_2)$ can be solved.

3) Parity group 12 is affected, but $X_{12}$ is still accessible. This case is trivial since all $d_j$, $j =
1, 2, \ldots, 23$ have been directly recovered.

4) Parity group 12 is affected, but $Y_{12}$ is still accessible. In this case $(X_j, Y_j) = (d_{2j-1}, d_{2j})$ for
$j = 1, 2, \ldots, 11$ can be recovered, and thus only $d_{23}$ needs to be recovered. But we have $Y_{12} =
d_{24} = \sum_{j=1}^{12} d_{2j-1} + \sum_{j=1}^{11} 2 d_{2j}$, from which $d_{23}$ can now be obtained.

5) Parity group 12 is affected, but $P_{12}$ is still accessible. Again $(d_{2j-1}, d_{2j})$ pairs for $j = 1, 2, \ldots, 11$
can be recovered. Additionally we have $P_{12} = d_{24} + d_{23} = \sum_{j=1}^{12} d_{2j-1} + \sum_{j=1}^{11} 2 d_{2j} + d_{23}$, from
which $d_{23}$ can be obtained.

Let us compare this code with the time-sharing code using an MBR code and an MSR code. For
$(n, k, d) = (9, 7, 8)$, the MSR point is $(\bar{\alpha}, \bar{M}) = (2, 14)$ and the MBR point is $(\bar{\alpha}, \bar{M}) = (8, 35)$. Our
construction achieves $(\bar{\alpha}, \bar{M}) = (4, 23)$, while the time-sharing performance between the MBR point and
the MSR point at $\bar{\alpha} = 4$ gives $\bar{M} = 21$; thus the example construction indeed achieves an improvement
on $\bar{M}$ while keeping $\bar{\alpha}$ the same.
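The time-sharing value quoted above is a linear interpolation between the two extreme points. A minimal sketch of that arithmetic follows; the function name `time_sharing` is an illustrative choice.

```python
from fractions import Fraction


def time_sharing(alpha_bar, msr, mbr):
    """Normalized data size reached at alpha_bar by time-sharing MSR and MBR codes."""
    (a0, m0), (a1, m1) = msr, mbr
    weight = Fraction(alpha_bar - a0, a1 - a0)  # fraction of the MBR code used
    return m0 + weight * (m1 - m0)


if __name__ == "__main__":
    msr, mbr = (2, 14), (8, 35)  # (n, k, d) = (9, 7, 8)
    baseline = time_sharing(4, msr, mbr)
    print(baseline, 23 > baseline)
```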
This example illustrates the main components in the proposed construction, i.e., a block design, a
first layer long MDS code, and a second layer short MDS code. The coefficients used in the two parity 
symbols of the two codes cannot be set arbitrarily, for example, if we were to set Pj = Xj + 2Yj, then 
in the second case discussed in the reconstruction procedure, a decoding failure would occur. The basic 
idea is to use the short MDS code to recover as many data symbols as possible which will render most 
of the parity symbols in the short MDS code redundant, and then use the remaining parity symbol in the 
short MDS code together with the parity symbol in the long MDS code to jointly solve the remaining 
unknown data symbol. 
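Both the reconstruction claim and the remark on the parity coefficients can be checked by brute force for this small example. The sketch below represents every stored symbol as a coefficient vector over $\mathbb{F}(3)$ in the 23 data symbols and tests, for each pair of inaccessible disks, whether the accessible symbols have full rank. The helper names and the X, Y, P placement order along each block are assumptions made here for illustration; the parameter `c` selects the short-code parity $P_j = X_j + cY_j$.

```python
from itertools import combinations

Q = 3
BLOCKS = [(2, 3, 4), (5, 6, 7), (1, 8, 9), (1, 4, 7), (1, 3, 5), (4, 6, 8),
          (2, 7, 9), (2, 5, 8), (1, 2, 6), (4, 5, 9), (3, 7, 8), (3, 6, 9)]


def rank_mod_q(rows):
    """Rank of a matrix over F(3), by Gaussian elimination."""
    rows = [[x % Q for x in r] for r in rows]
    rank = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, Q)
        rows[rank] = [(x * inv) % Q for x in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                f = rows[i][col]
                rows[i] = [(x - f * y) % Q for x, y in zip(rows[i], rows[rank])]
        rank += 1
    return rank


def symbol_vectors(c):
    """Coefficient vectors of (X_j, Y_j, P_j), j = 1..12, with P_j = X_j + c * Y_j."""
    unit = lambda i: [1 if t == i else 0 for t in range(1, 24)]
    d24 = [1 if t % 2 == 1 else 2 for t in range(1, 24)]  # long-code parity
    groups = []
    for j in range(1, 13):
        x = unit(2 * j - 1)
        y = unit(2 * j) if j < 12 else d24
        p = [(a + c * b) % Q for a, b in zip(x, y)]
        groups.append((x, y, p))
    return groups


def decodable_pairs(c=1):
    """Number of two-disk failure patterns from which all 23 symbols are recoverable."""
    groups = symbol_vectors(c)
    good = 0
    for failed in combinations(range(1, 10), 2):
        rows = [groups[j][pos] for j, block in enumerate(BLOCKS)
                for pos, disk in enumerate(block) if disk not in failed]
        good += rank_mod_q(rows) == 23
    return good


if __name__ == "__main__":
    print(decodable_pairs(c=1), "of 36 failure pairs decodable with P = X + Y")
    print(decodable_pairs(c=2), "of 36 failure pairs decodable with P = X + 2Y")
```

With `c = 2` the short-code parity and the long-code parity give the same combination $d_{2j-1} + 2d_{2j}$, so any failure pattern that loses both $X_j$ and $Y_j$ of a group cannot be decoded.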

IV. Code Constructions 

In this section, we first describe an explicit code construction for $(n, n-2, n-1)$ codes based on the Steiner
system $S(2, r, n)$. This construction, however, only applies to the case when a Steiner system exists for
such $n$, and as mentioned earlier, Steiner systems may not exist for all $(r, n)$ pairs. Then, based on BIBDs
$S_\lambda(2, r, n)$, the method is generalized to any $(n, k, d)$ triple such that $k < n - 1$ and $d = n - 1$.
Since a complete block design can be viewed as a special case of BIBDs, the construction applies to any
positive integer $n$. This construction can be further generalized to the case when $d < n - 1$,
which will be discussed briefly.

A. A Construction Based on $S(2, r, n)$

Given a block design $(I_n, \mathcal{B}) \in S(2, r, n)$, the exact-repair regenerating code with parameters $(n, k, d) =
(n, n-2, n-1)$ we shall construct has the following parameters

$$\alpha = \frac{n-1}{r-1}, \qquad \beta = 1, \qquad M = (r-1)N^* - 1 = \frac{n(n-1)}{r} - 1, \tag{16}$$

where we have used $N^*$ to denote $N(2, r, n)$ for notational simplicity. Note that these parameters are all
integers for a valid Steiner system; moreover, $n(n-1)$ is a multiple of $r(r-1)$, which can be seen using
Proposition 2 and its corollary. The alphabet for this code can be chosen to be any finite field $\mathbb{F}(q)$ with
a field size $q \ge r$, and the addition and multiplication operations in the encoding and decoding process
are performed in this field.

Let the $M$ information symbols in $\mathbb{F}(q)$ be given in an $(r-1) \times N^*$ matrix $D$, except the bottom-right
entry $D_{r-1,N^*}$, which is left blank. The code has several components:

Encoding: 
Fig. 2. Code structure based on S(2, r, n).



1) Choose $(r-1)$ distinct non-zero elements $\phi_1, \phi_2, \ldots, \phi_{r-1}$ in $\mathbb{F}(q)$, which satisfy $\phi_i + 1 \ne 0$ for
$i = 1, 2, \ldots, r-2$. Generate a parity symbol and assign it to $D_{r-1,N^*}$ as

$$D_{r-1,N^*} = \sum_{i=1}^{r-2} \sum_{j=1}^{N^*} \phi_i D_{i,j} + \sum_{j=1}^{N^*-1} \phi_{r-1} D_{r-1,j}. \tag{17}$$

2) For each column $j = 1, 2, \ldots, N^*$, generate new parity symbols as

$$D_{r,j} = P_j = \sum_{i=1}^{r-1} D_{i,j}. \tag{18}$$

The collection $(D_{1,j}, D_{2,j}, \ldots, D_{r-1,j}, P_j)$ will be referred to as the $j$-th parity group;

3) For each block $B_j = \{b_{j,1}, b_{j,2}, \ldots, b_{j,r}\} \in \mathcal{B}$, $j = 1, 2, \ldots, N^*$, distribute the symbols in the $j$-th
parity group onto disks $b_{j,1}, b_{j,2}, \ldots, b_{j,r}$, one symbol onto each disk.

Repair: 

Suppose disk-$m$ fails. In order to recover the symbols on this disk, find in $\mathcal{B}$ all blocks $B_j$ such that
$m \in B_j$. Recall there are a total of $\alpha$ such blocks, and let them be denoted as $B_{k_1}, B_{k_2}, \ldots, B_{k_\alpha}$. For
each such block $B_{k_l}$, $l = 1, 2, \ldots, \alpha$, obtain the symbols in the parity group $k_l$ from the disks in the
set $B_{k_l} \setminus \{m\}$, and recover the symbol in this parity group on disk-$m$ using the relation (18).

Reconstruction: 
Several cases need to be considered, when two disks have failed:

1) The $j$-th parity group loses two symbols which are the parity symbol $P_j$ and one data symbol
$D_{i,j}$, and the other parity groups each lose one symbol or fewer. This implies that the other parity
groups can recover all their data using (18), and thus only $D_{i,j}$ needs to be recovered. It can be
obtained through $D_{r-1,N^*}$, by eliminating in (17) the symbols in the other parity groups, and then
eliminating $D_{k,j}$, $k \ne i$.

2) The $j$-th parity group loses two symbols which are two data symbols $D_{i_1,j}$ and $D_{i_2,j}$, and the
other parity groups each lose one symbol or fewer. Other data symbols can be obtained as in the
previous case, and only $D_{i_1,j}$ and $D_{i_2,j}$ need to be recovered. Since $D_{r-1,N^*}$ is still available, by
eliminating the symbols in the other parity groups in (17), and then eliminating $D_{k,j}$, $k \ne i_1$ and
$k \ne i_2$, we obtain $\phi_{i_1} D_{i_1,j} + \phi_{i_2} D_{i_2,j}$. By eliminating $D_{k,j}$, $k \ne i_1$ and $k \ne i_2$ in (18), we obtain
$D_{i_1,j} + D_{i_2,j}$. Since $\phi_{i_1} \ne \phi_{i_2}$ and they are both non-zero, $D_{i_1,j}$ and $D_{i_2,j}$ can be solved using
these two equations.

3) Parity group $N^*$ loses two symbols, which are the parity symbols $D_{r-1,N^*}$ and $P_{N^*}$. This case is
trivial since all data symbols have been directly recovered.

4) Parity group $N^*$ loses two symbols, which are the parity symbol $P_{N^*}$ and a data symbol $D_{i,N^*}$,
$1 \le i \le r-2$. By eliminating the symbols in the other parity groups in $D_{r-1,N^*}$ using (17), and
then eliminating $D_{k,N^*}$, $k \ne i$, we obtain $D_{i,N^*}$.

5) Parity group $N^*$ loses two symbols, which are the parity symbol $D_{r-1,N^*}$ and a data symbol
$D_{i,N^*}$, $1 \le i \le r-2$. Note that $P_{N^*}$ is still available and

$$P_{N^*} = \sum_{i=1}^{r-1} D_{i,N^*} = \sum_{i=1}^{r-2} \sum_{j=1}^{N^*} \phi_i D_{i,j} + \sum_{j=1}^{N^*-1} \phi_{r-1} D_{r-1,j} + \sum_{i=1}^{r-2} D_{i,N^*}. \tag{19}$$

By eliminating the symbols in the other parity groups in $P_{N^*}$ and then eliminating $D_{k,N^*}$, $k \ne i$,
we obtain $(\phi_i + 1) D_{i,N^*}$ for some $1 \le i \le r-2$, and since $\phi_i + 1 \ne 0$ for such $i$, $D_{i,N^*}$ can be
correctly obtained.

The code construction is illustrated in Fig. 2. In the disk repair and data reconstruction procedures given
above, we have implicitly assumed that the following two facts hold:

• Fact one: During the repair, each remaining disk contributes exactly one symbol;

• Fact two: When two disks are not accessible, only one parity group has two inaccessible symbols,
and each of the other parity groups has at most one inaccessible symbol.

These are indeed true by invoking the basic property of Steiner systems, more precisely, that any pair of
elements in $I_n$ appears in exactly one of the blocks in $\mathcal{B}$.
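The two facts can be confirmed numerically on a design with $r > 3$. The sketch below uses the $S(2, 4, 13)$ design of Table I, regenerated as cyclic shifts of $(1, 2, 4, 10)$; the function names `helper_load` and `losses` are illustrative choices.

```python
from itertools import combinations

N = 13
# S(2, 4, 13) from Table I, as cyclic shifts of the base block (1, 2, 4, 10).
BLOCKS = [tuple((x - 1 + shift) % N + 1 for x in (1, 2, 4, 10)) for shift in range(N)]


def helper_load(blocks, n, failed):
    """Fact one: number of symbols each surviving disk sends when `failed` is repaired."""
    load = {h: 0 for h in range(1, n + 1) if h != failed}
    for block in blocks:
        if failed in block:
            for h in block:
                if h != failed:
                    load[h] += 1
    return load


def losses(blocks, inaccessible):
    """Fact two: sorted list of how many symbols each parity group loses."""
    lost = set(inaccessible)
    return sorted(len(lost & set(block)) for block in blocks)


if __name__ == "__main__":
    print(helper_load(BLOCKS, N, 1))
    print(losses(BLOCKS, (1, 2)))
```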

The long code in the construction is an $(M + 1, M)$ systematic MDS code whose parity symbol is
specified by (17), and the short code is an $(r, r-1)$ systematic MDS code whose parity symbol is specified
by (18). It should be noted that the coefficients in forming the parity symbols are certainly not unique,
and we have only given a convenient choice here. In many cases, the performance of these codes is better than
time-sharing between MSR and MBR points; however, we leave the detailed analysis to the next section
to avoid repetition.

B. A Construction Based on BIBDs S\(2,r,n) 

In this subsection, we generalize the construction previously described to the setting of exact-repair 
regenerating codes for any positive integer n, d = n — 1 and any k <n — 1, based on BIBDs S\(2, r, n). 
The validity of the construction relies on application of the Schwarz-Zippel lemma, which is used to show 
that there exists a valid choice of long MDS code when the alphabet is large than a given threshold. 

First fix a BIBD (I n ,B) £ S\(2,r,n), and again denote N\(2,r,n) as N*. First define the quantity 

T{A) = l-B n A| - 1, (20) 

B£B:\BnA\>2 

where A C I n and \A\ = n — k, then further define 

T= max T(A). (21) 

A:ACI n , \A\=n-k 

The relevance of this quantity will become clear shortly. When $n-k = 2$, the definition of BIBDs gives
$T = \lambda$. The construction given in the previous subsection belongs to this case with $T = \lambda = 1$. In
general, the quantity depends on the particular block design and does not appear to have an explicit
formula; however, we shall discuss a bound on this quantity in the next section.
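
As an illustration of (20) and (21), the following minimal Python sketch evaluates $T(A)$ and $T$ by brute force. The function names `t_of_a` and `t_max`, the 0-based disk labels, and the two small designs used as input (the Fano plane as a Steiner system on 7 points, and the complete design on 9 points with block size 3) are choices made here for illustration only.

```python
from itertools import combinations

def t_of_a(blocks, inaccessible):
    """T(A) from (20): sum of (|B & A| - 1) over blocks meeting A in >= 2 points."""
    a = set(inaccessible)
    total = 0
    for block in blocks:
        overlap = len(a.intersection(block))
        if overlap >= 2:
            total += overlap - 1
    return total

def t_max(blocks, n, k):
    """T from (21): maximum of T(A) over all sets A of n - k inaccessible disks."""
    return max(t_of_a(blocks, a) for a in combinations(range(n), n - k))

# Fano plane: every pair of points lies in exactly one block (lambda = 1).
fano = [(0, 1, 2), (0, 3, 4), (0, 5, 6), (1, 3, 5), (1, 4, 6), (2, 3, 6), (2, 4, 5)]
# Complete design on 9 points with block size 3: every pair lies in 7 blocks.
complete_9_3 = list(combinations(range(9), 3))

print(t_max(fano, n=7, k=5))          # n - k = 2, prints 1
print(t_max(complete_9_3, n=9, k=7))  # n - k = 2, prints 7
```

For both inputs with $n-k = 2$ the sketch returns $T = \lambda$, in line with the remark above. The exhaustive search over all sets $A$ is only practical for small $n$.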
The code we construct has the following parameters

$$\alpha = \frac{\lambda(n-1)}{r-1}, \qquad \beta = \lambda, \qquad M = (r-1)N^* - T = \frac{\lambda n(n-1)}{r} - T. \tag{22}$$

Note that although $\alpha$ is always an integer, $(n-1)$ is not necessarily a multiple of $r-1$ here, unlike in
the previous construction. This implies that the normalized value $\bar\alpha = \alpha/\beta$ may not be an integer.
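
The parameters in (22) can be evaluated mechanically once $T$ is known. The sketch below uses exact rational arithmetic so that a non-integer normalized value $\alpha/\beta$ is visible. The function names are illustrative assumptions, and the parameter set $(n, r, \lambda) = (6, 3, 2)$ is used only to exercise the formula.

```python
from fractions import Fraction

def bibd_code_parameters(n, r, lam, T):
    """(alpha, beta, M) from (22) for a BIBD S_lambda(2, r, n) and a given T."""
    num_blocks = Fraction(lam * n * (n - 1), r * (r - 1))  # N*
    alpha = Fraction(lam * (n - 1), r - 1)
    beta = Fraction(lam)
    M = (r - 1) * num_blocks - T
    return alpha, beta, M

def normalized(alpha, beta, M):
    """Normalize by beta, i.e. return (alpha / beta, M / beta)."""
    return alpha / beta, M / beta

# Steiner system on 9 points with block size 3: lambda = 1 and T = 1 when n - k = 2.
print(bibd_code_parameters(9, 3, 1, T=1))

# n = 6, r = 3: n - 1 = 5 is not a multiple of r - 1 = 2.
alpha, beta, M = bibd_code_parameters(6, 3, 2, T=2)
print(alpha, normalized(alpha, beta, M)[0])
```

In the second call $\alpha = 5$ is an integer while $\alpha/\beta = 5/2$ is not, which is the situation described above.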

Let the $M$ information symbols in $\mathbb{F}(q)$ be given in a vector $d$, and use it to fill the first $M$ entries
in an $(r-1)\times N^*$ matrix $D$ following the column-wise order, i.e., the first column (top-down), then the
second column, etc.; the remaining $T$ entries of the matrix are left blank. The code requires a matrix $S$
of size $T\times M$, whose entries are also in $\mathbb{F}(q)$. The matrix $S$ is used to generate the parity symbols for
the long MDS code, and we shall specify the condition for $S$ shortly.

Fig. 3. Code structure based on $S_\lambda(2,r,n)$.

Encoding: 

The encoding procedure is similar to the procedure given in the previous subsection, with the only
difference being that we first compute the product $S\cdot d$ and then fill the rest of the matrix $D$ using
the resultant $T$ parity symbols in a column-wise manner.
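
The encoding steps can be sketched in Python as follows. This is a minimal illustration under assumptions that are not fixed by the text: arithmetic is over a prime field GF(p), the short-code parity of each group is taken to be the plain sum of its column, and the i-th symbol of parity group j is placed on the i-th disk of block $B_j$. The function name `encode` and the example values of `d`, `S` and `p` are hypothetical.

```python
def encode(d, S, blocks, r, p):
    """Encode M information symbols of GF(p) onto disks (illustrative sketch).

    d: list of M symbols; S: T x M matrix (list of rows); blocks: N* blocks of size r.
    Returns {disk: [(group index j, symbol), ...]}.
    """
    num_groups = len(blocks)
    # Long code: append the T parity symbols S . d to the information symbols.
    long_parity = [sum(s * x for s, x in zip(row, d)) % p for row in S]
    entries = list(d) + long_parity
    assert len(entries) == (r - 1) * num_groups
    storage = {}
    for j, block in enumerate(blocks):
        # Column j of D, with D filled column-wise (top-down, then next column).
        column = entries[j * (r - 1):(j + 1) * (r - 1)]
        # Short code: assumed single parity symbol P_j = sum of the column.
        group = column + [sum(column) % p]
        # Assumed placement: i-th symbol of group j goes to the i-th disk of block j.
        for disk, symbol in zip(block, group):
            storage.setdefault(disk, []).append((j, symbol))
    return storage

# Fano plane (n = 7, r = 3, lambda = 1), so N* = 7, T = 1 and M = 13 when n - k = 2.
fano = [(0, 1, 2), (0, 3, 4), (0, 5, 6), (1, 3, 5), (1, 4, 6), (2, 3, 6), (2, 4, 5)]
d = list(range(1, 14))      # 13 information symbols
S = [list(range(1, 14))]    # one parity row, an arbitrary illustrative choice
storage = encode(d, S, fano, r=3, p=17)
print({disk: len(symbols) for disk, symbols in sorted(storage.items())})
```

Each disk ends up with three symbols in this example, one for every block that contains it.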

Repair: 

The repair is precisely the same as the repair procedure given in the previous subsection. Note that
each remaining disk contributes exactly $\lambda$ symbols, which is implied by the definition of $S_\lambda(2,r,n)$.
Reconstruction: 

Let $(n-k)$ disks in the set $A$ be inaccessible, where $|A| = n-k$. For each parity group $j = 1,2,\ldots,N^*$,
construct a length-$r$ vector $z_j$ as follows

• If $|B_j\cap A| \le 1$: collect, and if necessary compute using (18), the symbols $D_{i,j}$, $i = 1,2,\ldots,r-1$;
let $z_j = (D_{1,j}, D_{2,j}, \ldots, D_{r-1,j}, 0)^t$;

• If $|B_j\cap A| \ge 2$: collect the available symbols in this parity group, denoted as $(D_{i_1,j}, D_{i_2,j}, \ldots, D_{i_u,j})$,
assign $D_{i_1,j}, D_{i_2,j}, \ldots, D_{i_u,j}$ to the $i_1, i_2, \ldots, i_u$ positions of vector $z_j$, and let the rest of $z_j$ be
zeros.

Finally let $d_A = [z_1^t, z_2^t, \ldots, z_{N^*}^t]^t$, i.e., concatenate the vectors $z_j$. The entries of $d_A$ are linear
combinations of $d$. Our claim is that by properly choosing $S$, the vector $d$ can be reconstructed from
$d_A$ for any possible set $A$.



This construction is illustrated in Fig. 3, from which the differences from and similarities to the construction
given in the previous subsection are apparent. For the case $r = 2$, the proposed construction is
precisely the repair-by-transfer construction in |0. In this case, the parity symbol $P_j$ is a simple repetition,
and the $S_\lambda(2,2,n)$ design with $\lambda = \lambda^* = 1$ is the trivial complete design.

Next we show that a matrix $S$ with the desired properties indeed exists. Note that as long as the
transfer matrix between $d_A$ and $d$ has rank $M$, the information vector $d$ can be correctly reconstructed.
To identify this matrix, first construct a template matrix $R$ of size $r\times(r-1)$ as $R = [I, \mathbf{1}^t]^t$, where $I$
is the identity matrix, and $\mathbf{1}$ is the all-one vector of length $r-1$. For each $j = 1,2,\ldots,N^*$, construct a
matrix $R_j$ of size $r\times(r-1)$ as follows,

• If $|B_j\cap A| \le 1$, let $R_j$ be $R$ with the last row set to all zeros;

• If $|B_j\cap A| = l \ge 2$, let the symbols in the $j$-th parity group stored on disks $B_j\setminus A$
be $D_{i_1,j}, D_{i_2,j}, \ldots, D_{i_{r-l},j}$. Keep the rows $i_1, i_2, \ldots, i_{r-l}$ of $R$, assign the other rows as all zeros, and
let the resultant matrix be $R_j$.

Finally form a matrix $Q_A$ of size $(rN^*)\times(r-1)N^*$ using the matrices $R_j$ as the diagonal blocks, i.e.,

$$Q_A = \begin{bmatrix} R_1 & & & \\ & R_2 & & \\ & & \ddots & \\ & & & R_{N^*} \end{bmatrix}. \tag{23}$$

Clearly, we have

$$Q_A \cdot G \cdot d = d_A, \tag{24}$$

where $G = [I, S^t]^t$. Thus as long as $Q_A\cdot G$ has rank $M$ for each set $A\subseteq I_n$ such that $|A| = n-k$, the
information vector $d$ can be correctly decoded no matter which $n-k$ disks are inaccessible. We have
the following proposition.

Proposition 3: Among the $q^{TM}$ distinct assignments of $S$, at most a fraction of $q^{-1}\binom{n}{k}TM$ may
induce a matrix $Q_A\cdot G$ with rank less than $M$ for some $A\subseteq I_n$ such that $|A| = n-k$.

Proof: The proof is a direct application of the Schwartz-Zippel lemma in its counting form. For
each $A\subseteq I_n$ such that $|A| = n-k$, if we can show that the fraction of assignments resulting in
$\mathrm{rank}(Q_A\cdot G) < M$ is bounded by $q^{-1}TM$, then the bound given in the proposition is obtained by a
simple union bound over all choices of $A$. To show this, first remove the all-zero rows in $Q_A$, and then remove
the first $T - T(A)$ rows in the remaining matrix, resulting in a matrix $Q'_A$. Note that $Q'_A\cdot G$ is of size
$M\times M$, and thus as long as $Q'_A\cdot G$ has full rank, the matrix $Q_A\cdot G$ has rank $M$. However, $Q'_A\cdot G$
having full rank is equivalent to $\det(Q'_A\cdot G)\neq 0$. Since $\det(Q'_A\cdot G)$ is a polynomial $g(\cdot)$ of the entries
of $S$, as long as $g(\cdot)$ is not identically zero, we can apply the Schwartz-Zippel lemma and conclude the
proof. The polynomial $g(\cdot)$ is indeed not identically zero, which is proved in the appendix. ■
As a consequence of this proposition, when $q > \binom{n}{k}TM$, there exists at least one valid choice of the
matrix $S$; in fact, when $q$ is sufficiently large, almost all the assignments of $S$ are valid. The problem of
explicitly constructing the matrix $S$ is open; however, it may not be as complex as it seems. One possible
approach is to let $S$ be the parity portion of a systematic MDS code generator matrix, and then check
whether the full rank conditions are satisfied for each possible set $A\subseteq I_n$ with $|A| = n-k$, which is a
total of $\binom{n}{k}$ conditions.
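
The exhaustive check described above can be sketched as follows for a small case. This is a brute-force illustration under stated assumptions, not a prescribed procedure: the field is a prime field GF(p), the short-code parity is the plain column sum (matching the all-ones last row of the template matrix $R$), the i-th symbol of parity group j is assumed to sit on the i-th disk of block $B_j$, and the helper names `rank_mod_p`, `received_rows` and `is_valid_S` are hypothetical. Rather than forming $Q_A$ explicitly, the code lists the non-zero rows of $Q_A\cdot G$ directly. It requires Python 3.8 or later for the modular inverse via `pow`.

```python
from itertools import combinations

def rank_mod_p(rows, p):
    """Rank of a matrix over GF(p), p prime, by Gaussian elimination."""
    m = [[x % p for x in row] for row in rows]
    rank = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        pivot = next((i for i in range(rank, len(m)) if m[i][col]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, p)
        m[rank] = [(x * inv) % p for x in m[rank]]
        for i in range(len(m)):
            if i != rank and m[i][col]:
                f = m[i][col]
                m[i] = [(x - f * y) % p for x, y in zip(m[i], m[rank])]
        rank += 1
    return rank

def received_rows(blocks, r, S, M, inaccessible, p):
    """Non-zero rows of Q_A . G, where G = [I, S^t]^t."""
    def g_row(e):
        # Row e of G: a unit vector for information entries, a row of S otherwise.
        return [int(c == e) for c in range(M)] if e < M else list(S[e - M])
    rows = []
    for j, block in enumerate(blocks):
        data = [g_row(j * (r - 1) + i) for i in range(r - 1)]
        parity = [sum(col) % p for col in zip(*data)]
        symbols = data + [parity]
        lost = [disk in inaccessible for disk in block]
        if sum(lost) <= 1:
            rows.extend(data)   # R with its last row set to zero
        else:
            rows.extend(s for s, gone in zip(symbols, lost) if not gone)
    return rows

def is_valid_S(blocks, n, k, r, S, M, p):
    """True if Q_A . G has rank M for every set A of n - k inaccessible disks."""
    return all(
        rank_mod_p(received_rows(blocks, r, S, M, set(a), p), p) == M
        for a in combinations(range(n), n - k)
    )

fano = [(0, 1, 2), (0, 3, 4), (0, 5, 6), (1, 3, 5), (1, 4, 6), (2, 3, 6), (2, 4, 5)]
print(is_valid_S(fano, n=7, k=5, r=3, S=[list(range(1, 14))], M=13, p=17))
print(is_valid_S(fano, n=7, k=5, r=3, S=[[1] * 13], M=13, p=17))
```

In this small instance the candidate row $(1, 2, \ldots, 13)$ over GF(17) passes all 21 conditions, while the all-ones row does not: when the only surviving symbol of a parity group is its short-code parity, an all-ones long parity supplies the same linear combination again and the two lost symbols cannot be separated.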

C. A Construction Based on t-Designs $S_\lambda(t,r,n)$

The code construction for exact-repair regenerating codes presented in the previous subsection can be
generalized to the case $d < n-1$, by using general t-designs instead of BIBDs. The resulting codes
may require different amounts of data contributions from disks during repair, and thus do not strictly
belong to the class of codes defined in Section II. For this reason, instead of considering the per-disk rate
$\beta$ during repair, we shall only consider the total repair bandwidth $\gamma$ here. A special class of codes, based on
complete block designs $S_{\lambda^*}(t,r,n)$, can be made symmetric by time-sharing among different repair rate
allocations, as shall be discussed shortly.

Given a particular t-design $(I_n,\mathcal{B})\in S_\lambda(t,r,n)$, we shall construct an exact-repair regenerating code
with parameters $(n,k,d)$ using $(I_n,\mathcal{B})$, where $d = n-t+1$ and $k < d$. Similarly to the last subsection,
define the quantity

$$T(A) = \sum_{B\in\mathcal{B}:\,|B\cap A|\ge t}\left(|B\cap A| - t + 1\right), \tag{25}$$

where $A\subseteq I_n$ and $|A| = n-k$, then further define

$$T = \max_{A:\,A\subseteq I_n,\ |A|=n-k} T(A). \tag{26}$$

Fig. 4. Code structure based on $S_\lambda(t,r,n)$.

The code we construct has the following parameters (note that instead of $\beta$, here $\gamma$ is given)

$$\alpha = \frac{\lambda\binom{n-1}{t-1}}{\binom{r-1}{t-1}}, \qquad \gamma = (r-t+1)\alpha, \qquad M = (r-t+1)N_\lambda(t,r,n) - T. \tag{27}$$

The difference from the construction given in the last subsection is that instead of only one parity symbol,
$t-1$ parity symbols $P_{1,j}, P_{2,j}, \ldots, P_{t-1,j}$ are generated in the parity group $j$, using a fixed systematic
$(r, r-t+1)$ MDS code; see Fig. 4. Any symbol on a failed disk has at most $n-d = t-1$
symbols from the same parity group that are not participating in the repair; however, the $(r, r-t+1)$ MDS
code guarantees that this symbol can be recovered using the remaining at least $r-t+1$ symbols on the
other disks. Note that the amounts of data contributions from these disks are in general not symmetric,
although there may be many ways to choose which symbols to use in the repair. Similarly to the
previous case, it can be shown that there exists a matrix $S$ of size $T\times M$ which guarantees correct
decoding in a sufficiently large alphabet, and thus we omit the details to avoid repetition.

One particularly interesting case is when the complete block design $S_{\lambda^*}(t,r,n)$ is used. In this case,
although the data contributions from the disks during repair may not be symmetric, one can always
time-share among different helper rate contribution allocations, due to the symmetry of the code. Thus
this time-sharing version of the codes based on complete block designs indeed belongs to the class of
codes defined in Section II.

V. Performance Analysis 

In this section, we analyze the performance of the proposed codes more systematically. Recall that the
quantity $T$ involves an optimization problem, and it is block-design dependent. Thus in general $(\alpha,\beta,M)$
of a particular code cannot be explicitly evaluated. However, for the codes based on complete block
designs, the performance can indeed be explicitly evaluated. Moreover, for any given $(r,n)$, a complete
block design $S_{\lambda^*}(2,r,n)$ in fact offers the best performance among all possible BIBDs $S_\lambda(2,r,n)$, in
terms of the normalized measure $(\bar\alpha,\bar M)$, which is shown next.

A. The Optimality of Complete Block Designs 

Proposition 4: Consider an $(n,k,d)$ code $C_1$ constructed using a t-design $(I_n,\mathcal{B})\in S_\lambda(n-d+1,r,n)$,
which achieves $(\bar\alpha,\bar M_1)$, and an $(n,k,d)$ code $C_2$ constructed using the complete design $S_{\lambda^*}(n-d+1,r,n)$,
which achieves $(\bar\alpha,\bar M_2)$. Then $\bar M_1\le\bar M_2$, with equality if and only if in $(I_n,\mathcal{B})$ the
quantity $T(A)$ is uniform for all sets $A\subseteq I_n$ with $|A| = n-k$.
Proof:

The $T$ function on both $(I_n,\mathcal{B})\in S_\lambda(n-d+1,r,n)$ and on the complete design $S_{\lambda^*}(n-d+1,r,n)$
needs to be considered, and in order to distinguish them, we shall denote the latter as $T^*$.

Consider the block design resulting from a permutation $\pi$ of the elements of $I_n$, which also operates
on the blocks in $(I_n,\mathcal{B})$, and denote the $T(A)$ function on this permuted block design as $T_\pi(A)$, and the
$T$ function on this permuted block design as $T_\pi$. Construct a new and larger block design by taking all
the blocks resulting from the $n!$ permutations of the design $(I_n,\mathcal{B})$; note that there might be repetitions
of the same blocks, which is allowed in this new design. We shall denote the $T(A)$ function operating
on this new design as $T_p(A)$, and the corresponding $T$ function as $T_p$.

This new compound design is apparently a complete block design where each block is repeated $n!\,\lambda/\lambda^*$
times, and thus

$$T_p = \frac{n!\,\lambda}{\lambda^*}\,T^*. \tag{28}$$

However, since it is the combination of the $n!$ permutations of the block design $(I_n,\mathcal{B})$, we also have

$$T_p = T_p(I_{n-k}) = \sum_\pi T_\pi(I_{n-k}) \le \sum_\pi T_\pi = \sum_\pi T = n!\,T. \tag{29}$$

Thus $\lambda^* T\ge\lambda T^*$, which when combined with dT5]> and (27), gives $\bar M_1\le\bar M_2$. Clearly equality holds if
and only if $T_\pi(I_{n-k}) = T$ for all $\pi$, which is equivalent to $T(A) = T$ for all $A\subseteq I_n$ with $|A| = n-k$.
The proof is complete.

■ 

Although complete block designs provide the best $(\bar\alpha,\bar M)$ among all the t-designs in the same class,
other incomplete block designs may lead to simpler codes, as illustrated in the following example.

Example: Consider a code based on the complete block design $S_7(2,3,9)$, and thus $T = \lambda^* = 7$.
Using the general code construction based on BIBDs, we have a $(9,7,8)$ exact-repair regenerating code
with

$$\alpha = 28, \qquad \beta = 7, \qquad M = 161. \tag{30}$$

The $S$ matrix for the first layer code in this case is of size $7\times 161$, i.e., 7 parity symbols generated by
161 information symbols. In contrast, the example given in Section III, also a $(9,7,8)$ exact-repair
regenerating code, has a first layer code with only a single parity symbol, generated by 23 information
symbols. Note however that both codes achieve the same normalized measure $(\bar\alpha,\bar M) = (4,23)$.



B. Performance Analysis Using Complete Block Designs 

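Recall that $S_{\lambda^*}(t,r,n)$ can be used to construct codes with different $k$ values, where $t = n-d+1$.
With complete block designs, the value $T$ can be explicitly evaluated as follows using the symmetry

$$\begin{aligned}
T = T(I_{n-k}) &= \sum_{B\in\mathcal{B}:\,|B\cap I_{n-k}|\ge t}\left(|B\cap I_{n-k}| - t + 1\right) \\
&= \sum_{B\in\mathcal{B}:\,|B\cap I_{n-k}|\ge n-d+1}\left(|B\cap I_{n-k}| - n + d\right) \\
&= \sum_{i=n-d+1}^{\min(n-k,r)} (i-n+d)\binom{n-k}{i}\binom{k}{r-i}.
\end{aligned} \tag{31}$$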

It is clear that the code has a normalized $\bar\alpha$ as

$$\bar\alpha = \frac{d}{r-n+d}, \tag{32}$$

and a normalized $\bar M$ as

$$\bar M = \frac{d\left[(r-n+d)\binom{n}{r} - T\right]}{(r-n+d)\binom{n-1}{r-1}}. \tag{33}$$

Clearly for $\bar\alpha > 0$, we need $r\ge n-d+1$. Since at the MSR point $\bar\alpha = d-k+1$, it is more meaningful
to choose

$$r\le n-d+\frac{d}{d-k+1}. \tag{34}$$

However, choosing $r$ greater than this value is also valid, which may yield codes that, although not
efficient in terms of $(\bar\alpha,\bar M)$, are nevertheless useful due to their simplicity.

There does not seem to be any simplification for specific $(n,k,d)$ parameters. We provide a few
examples to illustrate the performance of the code for various $(n,k,d)$. In Fig. 5 we plot the performance
of the proposed codes for the case of $(n,k,d) = (9,7,8)$, and for reference the cut-set bound and the time-sharing
line are also included. It can be seen that in addition to the code example given in Section III,
there is one more parameter $r = 4$ that yields a performance above the time-sharing line; the proposed
code also achieves the point $(8,35)$, which is not surprising since in this case it reduces to the optimal
construction in 0. The operating point $(\bar\alpha,\bar M) = (2,13.4)$ is worth noting, because although it is not as
good as the MSR point $(2,14)$, the penalty is surprisingly small. This suggests that the proposed codes
may even be a good, albeit not optimal, choice to replace an MSR code, particularly when such MSR
codes have high complexity.

Fig. 6. Performance of the proposed codes for different $(k,d)$ parameters when $n = 24$. The dashed blue lines are the cut-set
bounds, the dotted black lines are the time-sharing lines, and the red solid lines are the tradeoff achieved by the proposed codes.

In Fig. 6 we plot the performance of codes for different parameters $(k,d)$ when $n = 24$. It can be
seen that when $d = n-1 = 23$, the performance is the most competitive, and often superior to the time-sharing
line. As the value of $d$ decreases, the method becomes less effective in terms of its $(\bar\alpha,\bar M)$, and becomes
worse than the time-sharing line. For the same $d$ value, the code is most effective when $k$ is large, and
becomes less so as the value of $k$ decreases. It should be noted that the lower left corner is the MSR point, and
in a wide range of parameters the proposed scheme in fact operates rather close to it, despite the simple
coding structure.
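
The operating points quoted in this subsection for $(n,k,d) = (9,7,8)$ can be reproduced with a short Python sketch. It assumes the complete-design count of $T$ (a sum over the overlap size $i$ of $(i-t+1)\binom{n-k}{i}\binom{k}{r-i}$ with $t = n-d+1$), the normalization by the average per-helper contribution $\gamma/d$, and the standard MSR and MBR points $(d-k+1,\ k(d-k+1))$ and $(d,\ k(2d-k+1)/2)$ for the time-sharing line. The function names are illustrative.

```python
from fractions import Fraction
from math import comb

def complete_design_T(n, k, d, r):
    """T for the complete design with block size r, with t = n - d + 1."""
    t = n - d + 1
    return sum((i - t + 1) * comb(n - k, i) * comb(k, r - i)
               for i in range(t, min(n - k, r) + 1))

def operating_point(n, k, d, r):
    """Normalized (alpha, M) of the code built on the complete design."""
    t = n - d + 1
    T = complete_design_T(n, k, d, r)
    alpha_bar = Fraction(d, r - t + 1)
    M_bar = Fraction(d * ((r - t + 1) * comb(n, r) - T),
                     (r - t + 1) * comb(n - 1, r - 1))
    return alpha_bar, M_bar

def time_sharing_M(n, k, d, alpha_bar):
    """Normalized M on the line joining the MSR and MBR points (needs k > 1)."""
    msr = (Fraction(d - k + 1), Fraction(k * (d - k + 1)))
    mbr = (Fraction(d), Fraction(k * (2 * d - k + 1), 2))
    slope = (mbr[1] - msr[1]) / (mbr[0] - msr[0])
    return msr[1] + slope * (alpha_bar - msr[0])

for r in range(2, 6):
    alpha_bar, M_bar = operating_point(9, 7, 8, r)
    line = time_sharing_M(9, 7, 8, alpha_bar)
    print(r, float(alpha_bar), float(M_bar), float(line))
```

For $r = 2, 3, 4, 5$ this yields the points $(8, 35)$, $(4, 23)$, $(8/3, 17)$ and $(2, 13.4)$. The points for $r = 3$ and $r = 4$ lie above the time-sharing line, $r = 2$ coincides with the MBR point, and $r = 5$ falls slightly below the MSR point $(2, 14)$.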

C. An Asymptotic Analysis 

In this subsection, we consider the asymptotic performance of the proposed codes when $n$ is large.
Recall that the case under consideration is when $k = n-\tau_1$ and $d = n-\tau_2$, where $\tau_1$ and $\tau_2$ are fixed
constant integers. We have the following theorem.

Theorem 3: Let $\mathcal{E}^*$ be the collection of $(E_r, E_d)$ pairs such that

$$E_d\le E_r+1, \qquad 2E_d\le 2+E_r, \qquad E_d\le 2. \tag{35}$$

Then $\mathcal{E}^* = \mathcal{E}$, and moreover, $\mathcal{E}^*$ can be asymptotically achieved by the proposed code construction.

Fig. 7. The achievable exponent region $\mathcal{E}$ and the exponent region achieved by time-sharing.

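Proof: We first show that $\mathcal{E}\subseteq\mathcal{E}^*$ by utilizing the cut-set bound in Theorem Q]. For better clarity,
we shall write $(\bar\alpha,\bar M)$ for a fixed $n$ explicitly as $(\bar\alpha^{(n)},\bar M^{(n)})$. First notice that the bound implies that
for any integer $c\in[0,k]$

$$\bar M^{(n)}\le\sum_{i=0}^{k-1}\min\left(\bar\alpha^{(n)}, d-i\right)\le\sum_{i=0}^{c-1}\bar\alpha^{(n)}+\sum_{i=c}^{k-1}(d-i) = c\,\bar\alpha^{(n)}+\frac{(2d-k-c+1)(k-c)}{2}. \tag{36}$$

Taking $c = 0$ gives

$$\bar M^{(n)}\le\sum_{i=0}^{k-1}(d-i) = \frac{(2d-k+1)k}{2}, \tag{37}$$

which implies that

$$E_d^{(n)} = \frac{\log\bar M^{(n)}}{\log n}\le\frac{\log\left((2d-k+1)k\right)-\log 2}{\log n}\le\frac{\log\left((d+n-\tau_2+1)k\right)-\log 2}{\log n}, \tag{38}$$

and thus for any $(E_r,E_d)\in\mathcal{E}$, $E_d\le 2$. By taking $c = k$, we have

$$\bar M^{(n)}\le k\,\bar\alpha^{(n)}, \tag{39}$$

which implies that

$$n\bar\alpha^{(n)}-\bar M^{(n)}\ge\frac{n-k}{k}\,\bar M^{(n)}. \tag{40}$$

It follows that

$$E_r^{(n)} = \frac{\log\left(n\bar\alpha^{(n)}-\bar M^{(n)}\right)}{\log n}\ge\frac{\log\left(\frac{n-k}{k}\bar M^{(n)}\right)}{\log n} = \frac{\log(n-k)-\log k}{\log n}+E_d^{(n)},$$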

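and thus $E_r - E_d\ge -1$ for any $(E_r,E_d)\in\mathcal{E}$. Next rewrite (36) as follows

$$\begin{aligned}
n\bar\alpha^{(n)}-\bar M^{(n)} &\ge (n-c)\,\bar\alpha^{(n)}-\frac{(2d-k-c+1)(k-c)}{2} && (41)\\
&\ge (n-c)\,\frac{\bar M^{(n)}}{k}-\frac{(2d-k-c+1)(k-c)}{2}, && (42)
\end{aligned}$$

where the second inequality is due to (39). Rearranging the right hand side of the above inequality as
a quadratic function in $c$, we have

$$\begin{aligned}
n\bar\alpha^{(n)}-\bar M^{(n)} &\ge -\frac{c^2}{2}+\frac{c}{2}\left(2d+1-\frac{2\bar M^{(n)}}{k}\right)-\frac{(2d-k+1)k}{2}+\frac{n\bar M^{(n)}}{k} \\
&= -\frac12\left(c-\frac{2d+1}{2}+\frac{\bar M^{(n)}}{k}\right)^2+\frac12\left(\frac{2d+1}{2}-\frac{\bar M^{(n)}}{k}\right)^2-\frac{(2d-k+1)k}{2}+\frac{n\bar M^{(n)}}{k}.
\end{aligned} \tag{43}$$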



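When the following condition holds

$$\bar M^{(n)}\ge\left(d-k+\tfrac12\right)k, \tag{44}$$

that is

$$E_d^{(n)}\ge\frac{\log\left(\left(d-k+\frac12\right)k\right)}{\log n}, \tag{45}$$

we can choose

$$c = \left\lfloor\frac{2d+1}{2}-\frac{\bar M^{(n)}}{k}\right\rfloor \tag{46}$$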


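and arrive at the bound

$$\begin{aligned}
n\bar\alpha^{(n)}-\bar M^{(n)} &\ge -\frac12+\frac12\left(\frac{2d+1}{2}-\frac{\bar M^{(n)}}{k}\right)^2-\frac{(2d-k+1)k}{2}+\frac{n\bar M^{(n)}}{k} \\
&= \frac{\left(\bar M^{(n)}\right)^2}{2k^2}+\frac12(d-k)^2+\frac12(d-k)+\left(n-d-\frac12\right)\frac{\bar M^{(n)}}{k}-\frac38 \\
&\ge \frac{\left(\bar M^{(n)}\right)^2}{2k^2}-\frac38.
\end{aligned} \tag{47}$$

This implies that when (45) is true, for any sufficiently large $n$ and any $\delta > 0$,

$$E_r^{(n)}\ge 2E_d^{(n)}-2-\delta, \tag{48}$$

and thus $E_r\ge 2E_d-2$ when $E_d > 1$. This completes the converse proof.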

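For the forward proof, we shall first fix a quantity $0 < \epsilon < 1$, and consider a sequence of codes with $r = n^\epsilon$ for
$n = n_0, n_0+1, \ldots$. From (32), we get

$$\bar\alpha^{(n)} = \frac{d}{r-n+d} = \frac{n-\tau_2}{n^\epsilon-\tau_2}. \tag{49}$$

Note that, with $t = n-d+1$,

$$\begin{aligned}
T &= \sum_{i=t}^{\min(n-k,r)}(i-t+1)\binom{n-k}{i}\binom{k}{r-i} && (50)\\
&\le \sum_{i=t}^{\min(n-k,r)}(n-k-t+1)\binom{n-k}{i}\binom{k}{r-i} && (51)\\
&= (\tau_1-\tau_2)\sum_{i=t}^{\min(n-k,r)}\binom{n-k}{i}\binom{k}{r-i} && (52)\\
&\le (\tau_1-\tau_2)\sum_{i=0}^{\min(n-k,r)}\binom{n-k}{i}\binom{k}{r-i} && (53)\\
&= (\tau_1-\tau_2)\binom{n}{r}. && (54)
\end{aligned}$$

Thus from (33) we have
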
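$$\begin{aligned}
\bar M^{(n)} &= \frac{d\left[(r-n+d)\binom{n}{r}-T\right]}{(r-n+d)\binom{n-1}{r-1}} && (55)\\
&\ge \frac{d\left[(r-n+d)\binom{n}{r}-(\tau_1-\tau_2)\binom{n}{r}\right]}{(r-n+d)\binom{n-1}{r-1}} && (56)\\
&= \frac{nd}{r}-(\tau_1-\tau_2)\frac{dn}{(r+d-n)\,r} && (57)\\
&= \frac{nd}{r}\left(1-\frac{\tau_1-\tau_2}{r-\tau_2}\right) && (58)\\
&= n^{1-\epsilon}(n-\tau_2)\left(1-\frac{\tau_1-\tau_2}{n^\epsilon-\tau_2}\right). && (59)
\end{aligned}$$

Since the last factor tends to 1 as $n\to\infty$, for sufficiently large $n$ the right hand side is of the order of

$$n^{1-\epsilon}(n-\tau_2). \tag{60}$$

This implies that for any $\delta > 0$ and sufficiently large $n$, there exists a code using the proposed design
such that $E_d^{(n)}\ge 2-\epsilon-\delta$.
Furthermore, combining (49) and (59), we have

$$n\bar\alpha^{(n)}-\bar M^{(n)}\le\frac{\tau_1\,n^{1-\epsilon}(n-\tau_2)}{n^\epsilon-\tau_2}, \tag{61}$$

which implies that for any $\delta > 0$ and sufficiently large $n$, there exists a code using the proposed
design such that $E_r^{(n)}\le 2-2\epsilon+\delta$. Because the region $\mathcal{E}$ is a closed set, it follows that the pair
$(E_r,E_d) = (2-2\epsilon, 2-\epsilon)$ is achievable for any $1 > \epsilon > 0$, and thus the region $2E_d\le 2+E_r$ is achievable
for any $2 > E_r > 0$. The case $(E_r,E_d) = (0,1)$ can be simply addressed by taking a sequence of $\epsilon_m$
such that $\epsilon_m\to 1$; the case of $(E_r,E_d) = (2,2)$ can be addressed similarly. The regions where $E_r > 2$
and $E_r < 0$ are degenerate, and they are easily shown to be achievable either by unnecessarily increasing the
redundancy in the code, or by unnecessarily increasing the amount of repair bandwidth. The proof is thus
complete. ■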
For comparison, let us consider the time-sharing scheme between an MSR and an MBR code. Note
that the MSR point corresponds to $(E_r,E_d) = (0,1)$ and the MBR point corresponds to $(E_r,E_d) = (2,2)$.
For a code using a time-sharing weight $\theta^{(n)}$, the rate can be bounded as

$$\begin{aligned}
\bar M^{(n)} &= \theta^{(n)}(n-\tau_1)(\tau_1-\tau_2+1)+\left(1-\theta^{(n)}\right)\frac{(n-\tau_1)(n+\tau_1-2\tau_2+1)}{2} \\
&\le (\tau_1-\tau_2+1)\,n+\frac{1-\theta^{(n)}}{2}\left(n^2-n-\tau_1(n-2\tau_2+1)\right),
\end{aligned} \tag{62}$$

and the redundancy can be bounded as

$$\begin{aligned}
n\bar\alpha^{(n)}-\bar M^{(n)} &= \theta^{(n)}\tau_1(\tau_1-\tau_2+1)+\left(1-\theta^{(n)}\right)\left(n(n-\tau_2)-\frac{(n-\tau_1)(n+\tau_1-2\tau_2+1)}{2}\right) \\
&\ge \frac{1-\theta^{(n)}}{2}\left(n^2-n-\tau_1(n-2\tau_2+1)\right).
\end{aligned} \tag{63}$$

For any sequence of such time-sharing codes indexed by $n$, we have

$$E_r^{(n)}\ge\frac{\log\left(\bar M^{(n)}-(\tau_1-\tau_2+1)n\right)}{\log n}, \tag{64}$$

which implies that for any $\delta > 0$ and sufficiently large $n$,

$$E_r^{(n)}\ge E_d^{(n)}-\delta, \tag{65}$$

and it follows that $E_r\ge E_d$ when $E_d > 1$. Using (62) and (63), we can also write

$$E_d^{(n)}\le\frac{\log\left(n\bar\alpha^{(n)}-\bar M^{(n)}+(\tau_1-\tau_2+1)n\right)}{\log n}, \tag{66}$$

which implies that when $E_r < 1$, we must have $E_d\le 1$ using the time-sharing strategy. See Fig. 7 for
an illustration of this region.

The gap between $\mathcal{E}$ and the time-sharing scheme shows that the improvement of our proposed codes
over the time-sharing scheme increases with $n$, and it can be unbounded.

VI. Conclusion



A new construction for $(n,k,d)$ exact-repair regenerating codes is proposed by combining two layers of
error correction codes together with combinatorial block designs. The resultant codes have the desirable
"uncoded repair" property where the nodes participating in the repair simply send certain stored data
without performing any computation. We show that the proposed code is able to achieve performance
better than the time-sharing between an MSR code and an MBR code for some parameters. For the case
of $d = n-1$ and $k = n-2$, an explicit construction is given in a finite field $\mathbb{F}(q)$ where $q$ is greater than or
equal to the block size in the combinatorial block designs. For more general $(d,k)$ parameters, we show
that there exist systematic linear codes in a sufficiently large finite field.

Appendix 

In this appendix, we prove that $\det(Q'_A\cdot G)$, as a function of the entries of the matrix $S$, is not identically
zero. For this purpose, we shall revisit the matrix $Q_A$. Set the first $T-T(A)$ non-zero rows in $Q_A$ to be
all zeros, and denote the resulting matrix as $Q^*_A$; let us omit the subscript $A$ from here on for simplicity.
If there exists an assignment of $S$ such that $Q^*\cdot G$ has rank $M$, then clearly $\det(Q'\cdot G)\neq 0$ for this assignment, since $Q'$
is simply $Q^*$ without the all-zero rows.

Recall the matrix $Q^*$ is of size $(rN^*)\times(M+T)$ with $M$ non-zero rows, and the matrix $G$ is of size
$(M+T)\times M$, where $M+T = (r-1)N^*$. Let the quotient and the remainder of $M$ divided by $(r-1)$
be $a$ and $b$, respectively, i.e., $M = a(r-1)+b$. Let us partition the matrix $Q^*$ into four sub-matrices as

$$Q^* = \begin{bmatrix} Q_{11} & Q_{12} \\ Q_{21} & Q_{22}\end{bmatrix}, \tag{67}$$

where $Q_{11}$ is of size $(ar+b)\times(a(r-1)+b)$, which implies $Q_{12}$ is an all-zero matrix. It follows that

$$Q^*\cdot G = \begin{bmatrix} Q_{11} & 0 \\ Q_{21} & Q_{22}\end{bmatrix}\begin{bmatrix} I \\ S\end{bmatrix} = \begin{bmatrix} Q_{11} \\ Q_{21}+Q_{22}\cdot S\end{bmatrix}. \tag{68}$$

In order to show that $Q^*\cdot G$ has rank $M$, our plan is to specify an auxiliary matrix $H$ which satisfies the
following two conditions.

• Condition one: the matrix $[Q_{11}^t, H^t]^t$ has rank $M$;

• Condition two: the equation $H = Q_{21}+Q_{22}\cdot S$ has a valid solution for $S$.

Clearly, if both conditions hold, the proof is essentially complete.

We start by first assuming that there exists at least one all-zero row in the bottom $r-b$ rows of $R_{a+1}$;
the other case will be addressed shortly. A set of $(a+1)$ intermediate matrices $H_1, H_2, \ldots, H_{a+1}$ shall
be constructed as follows. For $j = 1,2,\ldots,a$, find the all-zero rows in $R_j$, and denote the indices as
$l_1, l_2, \ldots, l_{e_j}$; if $e_j\ge 2$, then the matrix $H_j$ is of size $(e_j-1)\times M$, where the $i$-th row has all zeros
except the $(j-1)(r-1)+l_i$ position, which is assigned 1. For $j = a+1$, find the all-zero rows in the
first $b$ rows of $R_{a+1}$, and denote the indices as $l_1, l_2, \ldots, l_{e_{a+1}}$; the matrix $H_{a+1}$ is of size $e_{a+1}\times M$, where
the $i$-th row has all zeros except the $a(r-1)+l_i$ position, which is assigned 1. The matrix $H$, which
has the same size as $Q_{21}+Q_{22}\cdot S$, is formed by first assigning all zeros to the rows that are all zeros
in $[Q_{21}, Q_{22}]$, and then assigning the rows of $H_1, H_2, \ldots, H_{a+1}$ to the remaining rows of $H$ in any order.

We have inherently assumed above that the total number of rows in $H_1, H_2, \ldots, H_{a+1}$ is the same
as the number of rows in $[Q_{21}, Q_{22}]$ that have non-zero entries. This is indeed true because the former
together with the number of rows in $[Q_{11}, Q_{12}]$ that have non-zero entries totals to $M$, while the
latter also satisfies this relation.

To see that condition one holds, notice that by replacing the rows $(j-1)r+l_1, (j-1)r+l_2, \ldots, (j-1)r+l_{e_j-1}$
in $Q_{11}$ (i.e., the rows corresponding to the first $e_j-1$ all-zero rows in $R_j$)
with the rows of $H_j$, each block matrix $R_j$ attains rank $r-1$; by a similar operation, the top $b$ rows of the
matrix $R_{a+1}$ become an identity matrix. Due to the block diagonal structure of the matrix $Q_{11}$, this indeed
implies that the matrix $[Q_{11}^t, H^t]^t$ has rank $a(r-1)+b = M$.

To see that condition two also holds, we solve for $S$ block by block. First consider the block $R_{N^*}$ in
$Q_{22}$, and assume $N^* > a+1$. Due to the block structure of $Q_{22}$, the determination of the last $(r-1)$
rows of $S$ only depends on $R_{N^*}$ and the last $r$ rows of $H$, but not on any other entries in $Q^*$ and $H$. Let us
denote the sub-matrix consisting of the last $r-1$ rows of $S$ as $S'$, and denote the sub-matrix consisting
of the last $r$ rows of $H$ as $H'$. The problem essentially reduces to finding a solution for $R_{N^*}\cdot S' = H'$.
Denote the indices of the rows which have non-zero entries in $R_{N^*}$ as $i_1, i_2, \ldots, i_e$; then the column
span of $R_{N^*}$ is the space spanned by columns with a single 1 at the $i_1, i_2, \ldots, i_e$ positions; this further
relies on the structure of $R_{N^*}$ and the fact that there is at least one all-zero row in it. It follows that the
column span of $H'$ is in the column span of $R_{N^*}$, and thus there indeed exists a solution for $S'$. Repeating
this process for the other blocks, as well as the partial block of $R_{a+1}$ in $Q_{22}$, a solution for $S$ is found.

The case where there exists no all-zero row in the bottom $r-b$ rows of $R_{a+1}$ introduces a complication
for the partial block of $R_{a+1}$ in $Q_{22}$, because in this case its column span has one dimension less
than the space spanned by columns with a single 1 at the desired positions. However, the only change
required is the following: the $(r-b)$-th row of $H$ is chosen to be the summation of its first $(r-b-1)$
rows and the $(r-b)$-th row of $Q_{21}$. It is straightforward to check that both condition one and condition
two can still be made to hold for this case. The proof is complete. ■